Translating Implicit Discourse Connectives Based on Cross-lingual Annotation and Alignment
Abstract
Implicit discourse connectives and relations are more widely distributed in Chinese texts, and when translating into English, such connectives are usually translated explicitly. Towards Chinese-English MT, in this paper we describe cross-lingual annotation and alignment of discourse connectives in a parallel corpus, together with related surveys and findings. We then conduct evaluation experiments to test the translation of implicit connectives and whether representing implicit connectives explicitly in the source language significantly improves final translation performance. Preliminary results show that merely inserting explicit connectives for implicit relations yields little improvement.
Similar resources
Cross-Lingual Identification of Ambiguous Discourse Connectives for Resource-Poor Language
The lack of annotated corpora limits research on discourse classification for many languages. In this paper, we present the first effort towards recognizing ambiguities of discourse connectives, which is fundamental to discourse classification for resource-poor languages such as Chinese. A language-independent framework is proposed utilizing bilingual dictionaries, Penn Discourse ...
COLING 2012 24th International Conference on Computational Linguistics
Invited position papers: Remarks on some not so closed issues concerning discourse connectives (Aravind Joshi); Penn Discourse Treebank Relations and their Potential for Language Generation (Kathleen McKeown) ...
Crosslingual Annotation and Analysis of Implicit Discourse Connectives for Machine Translation
Usage of discourse connectives (DCs) differs across languages, so addition and omission of connectives are common in translation. We investigate how implicit (omitted) DCs in the source text impact various machine translation (MT) systems, and whether a discourse parser is needed as a preprocessor to explicitate implicit DCs. Based on the manual annotation and alignment of 7266 pairs of disc...
Discovery of Ambiguous and Unambiguous Discourse Connectives via Annotation Projection
We present work on tagging German discourse connectives using English training data and a German-English parallel corpus, and report first results towards a more comprehensive approach of doing annotation projection for explicit discourse relations. Our results show that (i) an approach based on a dictionary of connectives currently has advantages over a simpler approach that uses word alignmen...
Attribution And The (Non-)Alignment Of Syntactic And Discourse Arguments Of Connectives
The annotations of the Penn Discourse Treebank (PDTB) include (1) discourse connectives and their arguments, and (2) attribution of each argument of each connective and of the relation it denotes. Because the PDTB covers the same text as the Penn TreeBank WSJ corpus, syntactic and discourse annotation can be compared. This has revealed significant differences between syntactic structure and dis...